Cần Thơ
BoMGene: Integrating Boruta-mRMR feature selection for enhanced Gene expression classification
Phan, Bich-Chung, Ma, Thanh, Nguyen, Huu-Hoa, Do, Thanh-Nghi
Feature selection is a crucial step in analyzing gene expression data, enhancing classification performance, and reducing computational costs for high-dimensional datasets. This paper proposes BoMGene, a hybrid feature selection method that effectively integrates two popular techniques: Boruta and Minimum Redundancy Maximum Relevance (mRMR). The method aims to optimize the feature space and enhance classification accuracy. Experiments were conducted on 25 publicly available gene expression datasets, employing widely used classifiers such as Support Vector Machine (SVM), Random Forest, XGBoost (XGB), and Gradient Boosting Machine (GBM). The results show that the Boruta-mRMR combination selects fewer features than mRMR alone, which shortens training time while maintaining or even improving classification accuracy relative to the individual feature selection methods. The proposed approach demonstrates clear advantages in accuracy, stability, and practical applicability for multi-class gene expression data analysis.
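As a rough illustration of the mRMR half of the pipeline described above, the sketch below greedily picks features that maximise relevance to the label minus mean redundancy with the features already chosen. It is not the authors' implementation: it assumes discretised feature values, uses the "difference" form of the mRMR score, and takes a pre-filtered candidate list that merely stands in for the genes a Boruta step would confirm. The names `mutual_information`, `mrmr_select`, and the toy genes `g1` to `g3` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, candidates, k):
    """Greedy mRMR: repeatedly add the candidate with the best
    relevance-minus-mean-redundancy score until k features are chosen."""
    relevance = {f: mutual_information(features[f], labels) for f in candidates}
    selected, remaining = [], list(candidates)

    def score(f):
        if not selected:
            return relevance[f]
        redundancy = sum(
            mutual_information(features[f], features[s]) for s in selected
        ) / len(selected)
        return relevance[f] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumption): four classes, three discretised "genes".
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "g1": [0, 0, 0, 0, 1, 1, 2, 2],  # most informative about the label
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],  # informative but fully implied by g1
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],  # informative and only partly redundant
}
# "candidates" stands in for the features a Boruta step would have kept.
print(mrmr_select(features, labels, ["g1", "g2", "g3"], k=2))  # ['g1', 'g3']
```

Here `g2` is passed over in favour of `g3` because everything `g2` says about the label is already carried by `g1`, which is the redundancy penalty at work.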
- Asia > Vietnam > Cần Thơ > Cần Thơ (0.05)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Hybrid Layer-Wise ANN-SNN With Surrogate Spike Encoding-Decoding Structure
Luu, Nhan T., Luu, Duong T., Nam, Pham Ngoc, Thang, Truong Cong
Spiking Neural Networks (SNNs) have gained significant traction in both computational neuroscience and artificial intelligence for their potential in energy-efficient computing. In contrast, artificial neural networks (ANNs) excel at gradient-based optimization and high accuracy. This contrast has led to a growing subfield of hybrid ANN-SNN research. However, existing hybrid approaches often either strictly separate the ANN and SNN components or use SNN-only encoders followed by ANN classifiers, because spike encoding functions are non-differentiable; as a result, prior hybrid architectures lack deep layer-wise cooperation during backpropagation. To address this gap, we propose a novel hybrid ANN-SNN framework that integrates layer-wise encode-decode SNN blocks within conventional ANN pipelines. Central to our method is the use of surrogate gradients for a bit-plane-based spike encoding function, enabling end-to-end differentiable training across ANN and SNN layers. This design achieves accuracy competitive with state-of-the-art pure ANN and SNN models while retaining the potential efficiency and temporal representation benefits of spiking computation. To the best of our knowledge, this is the first use of a surrogate gradient for bit-plane coding specifically, and for a spike encoder interface in general, in the context of hybrid ANN-SNN models, leading to a new class of hybrid models that opens new directions for future research.
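To make the bit-plane idea concrete, the sketch below splits integer activations into binary planes (one plane per bit, most significant first), which can be read as spike trains over as many time steps, and then reassembles them. It only illustrates the forward encode-decode round trip; the surrogate gradient used for training is not reproduced here, and the 8-bit depth and the function names are assumptions rather than details taken from the paper.

```python
def bit_plane_encode(values, bits=8):
    """Return `bits` binary planes (most significant bit first) for a list of ints."""
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError(f"value {v} does not fit in {bits} bits")
    return [[(v >> b) & 1 for v in values] for b in range(bits - 1, -1, -1)]


def bit_plane_decode(planes):
    """Invert bit_plane_encode: weight each plane by its power of two and sum."""
    bits = len(planes)
    width = len(planes[0]) if planes else 0
    return [
        sum(planes[i][j] << (bits - 1 - i) for i in range(bits))
        for j in range(width)
    ]


pixels = [0, 5, 128, 255]          # hypothetical 8-bit intensities
planes = bit_plane_encode(pixels)
print(planes[0])                   # most significant plane: [0, 0, 1, 1]
print(bit_plane_decode(planes))    # [0, 5, 128, 255]
```

The encoder is a step-like function of its input, so its true gradient is zero almost everywhere; that is the obstacle a surrogate gradient is meant to work around.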
- Asia > Vietnam > Cần Thơ > Cần Thơ (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges
Taillandier, Patrick, Zucker, Jean Daniel, Grignard, Arnaud, Gaudou, Benoit, Huynh, Nghi Quang, Drogoul, Alexis
This position paper examines the use of Large Language Models (LLMs) in social simulation, analyzing both their potential and their limitations from a computational social science perspective. The first part reviews recent findings on the ability of LLMs to replicate key aspects of human cognition, including Theory of Mind reasoning and social inference, while also highlighting significant limitations such as cognitive biases, lack of true understanding, and inconsistencies in behavior. The second part surveys emerging applications of LLMs in multi-agent simulation frameworks, focusing on system architectures, scale, and validation strategies. Notable projects such as Generative Agents (Smallville) and AgentSociety are discussed in terms of their design choices, empirical grounding, and methodological innovations. Particular attention is given to the challenges of behavioral fidelity, calibration, and reproducibility in large-scale LLM-driven simulations. The final section distinguishes between contexts where LLMs, like other black-box systems, offer direct value, such as interactive simulations and serious games, and those where their use is more problematic, notably explanatory or predictive modeling. The paper concludes by advocating for hybrid approaches that integrate LLMs into traditional agent-based modeling platforms (GAMA, NetLogo, etc.), enabling modelers to combine the expressive flexibility of language-based reasoning with the transparency and analytical rigor of classical rule-based systems.
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Pacific Ocean > North Pacific Ocean > Puget Sound (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Education (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Coupling Agent-Based Simulations and VR universes: the case of GAMA and Unity
Drogoul, Alexis, Taillandier, Patrick, Brugière, Arthur, Martinez, Louis, Sillano, Léon, Lesquoy, Baptiste, Nghi, Huynh Quang
Agent-based models (ABMs) and video games, including those taking advantage of virtual reality (VR), have undergone a remarkable parallel evolution, achieving impressive levels of complexity and sophistication. This paper argues that while ABMs prioritize scientific analysis and understanding and VR aims for immersive entertainment, they both simulate artificial worlds and can benefit from closer integration. Coupling both approaches indeed opens interesting possibilities for research and development in various fields, and in particular education, at the heart of the SIMPLE project, an EU-funded project on the development of digital tools for awareness raising on environmental issues. However, existing tools often present limitations, including technical complexity, limited functionalities, and lack of interoperability. To address these challenges, we introduce a novel framework for linking GAMA, a popular ABM platform, with Unity, a widely used game engine. This framework enables seamless data exchange, real-time visualization, and user interaction within VR environments, allowing researchers to leverage the strengths of both ABMs and VR for more impactful and engaging simulations. We demonstrate the capabilities of our framework through two prototypes built to highlight its potential in representing and interacting with complex socio-environmental system models. We conclude by emphasizing the importance of continued collaboration between the ABM and VR communities to develop robust, user-friendly tools, paving the way for a new era of collaborative research and immersive experiences in simulations.
- Asia > Vietnam > Hanoi > Hanoi (0.05)
- South America > Brazil (0.04)
- Asia > Vietnam > Hanoi > Hoàn Kiếm District, Hanoi (0.04)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine (1.00)
- Education (1.00)
An approach to hummed-tune and song sequences matching
Pham, Loc Bao, Luong, Huong Hoang, Tran, Phu Thien, Ngo, Phuc Hoang, Nguyen, Vi Hoang, Nguyen, Thinh
A melody stuck in your head, also known as an "earworm", is tough to get rid of unless you listen to it again or sing it out loud. But what if you cannot find the name of that song? It must be an intolerable feeling. Recognizing a song name based on a hummed sound is not an easy task for a human being and should be done by machines. However, there is no research paper published about hum tune recognition. This work is adapted from the Hum2Song Zalo AI Challenge 2021, a competition on retrieving the name of a song from a user's hummed tune, which is similar to Google's Hum to Search. This paper details how the data were pre-processed from the original format (mp3) into a form usable for training and inference. To train an embedding model for the feature extraction phase, we ran experiments with several state-of-the-art architectures, such as ResNet, VGG, AlexNet, and MobileNetV2. For the inference phase, we use the Faiss library to efficiently search for the song that matches the hummed sequence. The result is nearly 94% in the MRR@10 metric on the public test set, along with the top-1 position on the public leaderboard.
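For readers unfamiliar with the reported metric, the sketch below computes MRR@10 in its usual form: for each hummed query, take the reciprocal of the rank at which the correct song first appears among the top 10 retrieved candidates (0 if it is absent), then average over queries. The challenge's exact scoring rules are not given in the abstract, so this standard definition is an assumption, and the function name and toy song identifiers are hypothetical.

```python
def mrr_at_k(ranked_lists, truths, k=10):
    """Mean reciprocal rank, counting only the top-k results of each query."""
    if len(ranked_lists) != len(truths):
        raise ValueError("need exactly one ground-truth item per query")
    if not truths:
        return 0.0
    total = 0.0
    for ranked, truth in zip(ranked_lists, truths):
        for rank, item in enumerate(ranked[:k], start=1):
            if item == truth:
                total += 1.0 / rank  # only the first hit counts
                break
    return total / len(truths)


# Three hypothetical queries: hit at rank 1, hit at rank 3, and a miss.
retrieved = [["song_a", "song_b"], ["song_x", "song_y", "song_z"], ["song_p", "song_q"]]
correct = ["song_a", "song_z", "song_missing"]
print(mrr_at_k(retrieved, correct))  # (1 + 1/3 + 0) / 3
```

In a retrieval pipeline like the one described, each ranked list would come from a nearest-neighbour search over song embeddings.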
- Media > Music (0.70)
- Leisure & Entertainment (0.70)
Learning Algorithms Made Simple
Golilarz, Noorbakhsh Amiri, Hossain, Elias, Addeh, Abdoljalil, Rahimi, Keyan Alexander
In this paper, we discuss learning algorithms and their importance in different types of applications, including training to identify important patterns and features, in a straightforward, easy-to-understand manner. We review the main concepts of artificial intelligence (AI), machine learning (ML), deep learning (DL), and hybrid models. Important subsets of machine learning algorithms, such as supervised, unsupervised, and reinforcement learning, are also discussed. These techniques can be used for important tasks such as prediction, classification, and segmentation. Convolutional Neural Networks (CNNs) are used for image and video processing, among many other applications. We dive into the architecture of CNNs and how to integrate CNNs with ML algorithms to build hybrid models. This paper also explores the vulnerability of learning algorithms to noise, which leads to misclassification. We further discuss the integration of learning algorithms with Large Language Models (LLMs) to generate coherent responses applicable to many domains, such as healthcare, marketing, and finance, by learning important patterns from large volumes of data. Furthermore, we discuss the next generation of learning algorithms and how a unified Adaptive and Dynamic Network might perform important tasks. Overall, this article provides a brief overview of learning algorithms, exploring their current state, applications, and future direction.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- North America > United States > Mississippi (0.04)
- Asia > Taiwan (0.04)
- Overview (1.00)
- Research Report > Promising Solution (0.46)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Education > Curriculum > Subject-Specific Education (0.46)
Evolutionary Multi-Objective Optimisation for Fairness-Aware Self Adjusting Memory Classifiers in Data Streams
Amarasinghe, Pivithuru Thejan, Pham, Diem, Tran, Binh, Nguyen, Su, Sun, Yuan, Alahakoon, Damminda
This paper introduces a novel approach, evolutionary multi-objective optimisation for fairness-aware self-adjusting memory classifiers, designed to enhance fairness in machine learning algorithms applied to data stream classification. With the growing concern over discrimination in algorithmic decision-making, particularly in dynamic data stream environments, there is a need for methods that ensure fair treatment of individuals across sensitive attributes like race or gender. The proposed approach addresses this challenge by integrating the strengths of the self-adjusting memory K-Nearest-Neighbour algorithm with evolutionary multi-objective optimisation. This combination allows the new approach to efficiently manage concept drift in streaming data and leverage the flexibility of evolutionary multi-objective optimisation to maximise accuracy and minimise discrimination simultaneously. We demonstrate the effectiveness of the proposed approach through extensive experiments on various datasets, comparing its performance against several baseline methods in terms of accuracy and fairness metrics. Our results show that the proposed approach maintains competitive accuracy and significantly reduces discrimination, highlighting its potential as a robust solution for fairness-aware data stream classification. Further analyses also confirm the effectiveness of the strategies to trigger evolutionary multi-objective optimisation and adapt classifiers in the proposed approach.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Asia > Vietnam > Cần Thơ > Cần Thơ (0.04)
- Overview (1.00)
- Research Report > New Finding (0.86)
- Research Report > Promising Solution (0.66)
- Health & Medicine (0.93)
- Education > Educational Setting > Online (0.68)
Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT
Bui, Tuan, Tran, Oanh, Nguyen, Phuong, Ho, Bao, Nguyen, Long, Bui, Thang, Quan, Tho
In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly. Despite their powerful language capabilities, LLMs, like pre-trained language models (PLMs), still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, some researchers have proposed Retrieval-Augmented Generation (RAG) techniques, while others have proposed integrating LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate responses to user queries. Education plays a crucial role in human development and progress. With the technology transformation, traditional education is being replaced by digital or blended education. As a result, educational data in the digital environment is increasing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of the KG in conjunction with LLMs for question-answering tasks.
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.05)
- Asia > Thailand > Phuket > Phuket (0.05)
- North America > United States > New York > New York County > New York City (0.04)
DeforestVis: Behavior Analysis of Machine Learning Models with Surrogate Decision Stumps
Chatzimparmpas, Angelos, Martins, Rafael M., Telea, Alexandru C., Kerren, Andreas
As the complexity of machine learning (ML) models increases and their application in different (and critical) domains grows, there is a strong demand for more interpretable and trustworthy ML. A direct, model-agnostic way to interpret such models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal: providing users with model interpretability. To tackle this, we propose DeforestVis, a visual analytics tool that offers user-friendly summarization of the behavior of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the adaptive boosting (AdaBoost) technique. DeforestVis helps users to explore the complexity vs. fidelity trade-off by incrementally generating more stumps, creating attribute-based explanations with weighted stumps to justify decision making, and analyzing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case analyses. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
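The idea of boosting one-level trees can be sketched in a few lines. The code below is a minimal discrete AdaBoost over decision stumps with labels in {-1, +1}; it is not DeforestVis itself, and the exhaustive threshold search, the function names, and the toy data are assumptions. To use it as a surrogate in the sense described above, `y` would hold the predictions of the complex model being explained rather than ground-truth labels.

```python
from math import exp, log


def stump_predict(stump, row):
    """A stump is (weighted_error, feature_index, threshold, polarity)."""
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                candidate = (0.0, j, thr, polarity)
                err = sum(
                    wi for wi, row, yi in zip(w, X, y)
                    if stump_predict(candidate, row) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def adaboost(X, y, rounds):
    """Return a list of (alpha, stump) pairs; alpha is the stump's vote weight."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight the instances this stump got wrong, then renormalise.
        w = [wi * exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D problem no single stump can solve: the negative class sits in the middle.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
print([predict(ensemble, row) for row in X])  # [1, 1, -1, -1, 1, 1]
```

Each added stump raises fidelity at the cost of one more rule to read, which is the complexity versus fidelity trade-off the abstract refers to.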
- North America > United States > Wisconsin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Research Report (1.00)
- Personal > Interview (0.66)
On the Relationship Between RNN Hidden State Vectors and Semantic Ground Truth
Muškardin, Edi, Tappler, Martin, Pill, Ingo, Aichernig, Bernhard K., Pock, Thomas
We examine the assumption that the hidden-state vectors of recurrent neural networks (RNNs) tend to form clusters of semantically similar vectors, which we dub the clustering hypothesis. While this hypothesis has been assumed in the analysis of RNNs in recent years, its validity has not been studied thoroughly on modern neural network architectures. We examine the clustering hypothesis in the context of RNNs that were trained to recognize regular languages. This enables us to draw on perfect ground-truth automata in our evaluation, against which we can compare the RNN's accuracy and the distribution of the hidden-state vectors. We start with examining the (piecewise linear) separability of an RNN's hidden-state vectors into semantically different classes. We continue the analysis by computing clusters over the hidden-state vector space with multiple state-of-the-art unsupervised clustering approaches. We formally analyze the accuracy of computed clustering functions and the validity of the clustering hypothesis by determining whether clusters group semantically similar vectors to the same state in the ground-truth model. Our evaluation supports the validity of the clustering hypothesis in the majority of examined cases. We observed that the hidden-state vectors of well-trained RNNs are separable, and that the unsupervised clustering techniques succeed in finding clusters of similar state vectors.
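One simple way to check whether clusters group vectors that belong to the same ground-truth automaton state is cluster purity: assign each cluster its most frequent state and count how many vectors agree with that assignment. The abstract does not say which measure the authors use, so purity is an assumption made here for illustration, as are the function name and the toy cluster and state labels.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_states):
    """Fraction of items whose ground-truth state equals their cluster's majority state."""
    if len(cluster_ids) != len(true_states):
        raise ValueError("need exactly one ground-truth state per clustered vector")
    if not cluster_ids:
        raise ValueError("purity is undefined for an empty clustering")
    states_by_cluster = defaultdict(Counter)
    for cluster, state in zip(cluster_ids, true_states):
        states_by_cluster[cluster][state] += 1
    agreeing = sum(max(counts.values()) for counts in states_by_cluster.values())
    return agreeing / len(cluster_ids)


# Hypothetical output of a clustering run over six hidden-state vectors,
# paired with the automaton state each vector corresponds to.
clusters = [0, 0, 0, 1, 1, 2]
states = ["q0", "q0", "q1", "q1", "q1", "q2"]
print(cluster_purity(clusters, states))  # 5 of 6 vectors agree with their cluster's majority
```

A purity of 1.0 would mean every cluster maps to a single automaton state. Note that purity alone does not penalise splitting one state across several clusters.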
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Austria > Styria > Graz (0.04)
- North America > United States > Colorado > Denver County > Denver (0.04)